Abstract: A lossy compression algorithm is presented for astronomical images that protects photometric
integrity for detected point sources at a user-defined level of statistical tolerance. PHOTZIP works
by modeling, smoothing, and then compressing the astronomical background behind self-detected point
sources, while completely preserving values in and around those sources. The algorithm also guarantees
a maximum absolute difference (in terms of $\sigma$) between each compressed and original background pixel,
allowing users to control quality and lossiness. For present purposes, PHOTZIP has been tailored to
FITS format and is freely available over the web. PHOTZIP has been tested over a broad range
of astronomical imagery and is in routine use by the Night Sky Live (NSL) project for compression
of all-sky FITS images. Compression factors depend on source densities, but for the canonical NSL
implementation, a PHOTZIP (and subsequently GZIP or BZIP2) compressed file is typically 20% of
its uncompressed size.

Keywords: methods: data analysis - techniques: image processing, photometric - astronomical data 
bases: miscellaneous 



1 Introduction 

Astronomy research images are now almost exclusively 
digital. There is an increasing need to store these im- 
ages and to send them over the Internet. Average 
bandwidth and storage capacity, although increasing, 
continue to be a bottleneck that frequently limits scien- 
tific exploitation of these images. Lossless compression 
of astronomical image files therefore has a clear
advantage over no compression, since it increases
effective storage and bandwidth and so bolsters scientific
utility. Lossy compression is more controversial, how- 
ever, as the scientific value of the data lost in the com- 
pressed images must be weighed against the scientific 
value of the data gained by the extra bandwidth and 
storage space. 

One of the premier scientific uses for astronomical 
images is photometry. Whether detecting the pres- 
ence of, for example, distant supernovae, Local Group 
microlensing, binary star variability, or planetary tran- 
sits, the need for photometric accuracy in astronomy 
images remains a primary objective for many astro- 
nomical research projects. 

Frequently, the photometric value of an astronom- 
ical image is concentrated in the point sources in the 
image. Conversely, the bulk of the image size is con- 
centrated in the background behind these sources. When 
the number of pixels taken up by sources is small com- 
pared to the number of pixels that compose the back- 
ground, it becomes possible to significantly compress 
the image size while preserving a certain level of pho- 
tometric integrity. 

While lossless compression algorithms preserve 100%
of the signal, lossy algorithms can provide a better
compression factor while losing some of the signal
(Press 1992; Fixsen et al. 2000; Watson 2002). However,
lossy compression algorithms tend to convolve science
with art. Astronomy-specific lossy data compression
algorithms are not new, and many have been proposed
and widely used (Pence 1994; Pence et al. 2000, 2002;
White 1992; Press 1992). HCOMPRESS (White 1992)
is a commonly used FITS compression program
based on a two-dimensional Haar wavelet transform.
This compression program is fast and provides a
relatively high compression factor. However, while giving
the user control of the lossiness\compression trade-off,
HCOMPRESS provides only limited control over the
type of signal that is lost in the compression\decompression
process.

The most similar compression approach to that discussed
here is the insightful FITS compression program
FITSPRESS (Press 1992). FITSPRESS is a
wavelet-based compression algorithm that, among
other things, is sensitive to preserving the brightest
image pixels.

Lossy compression (e.g. JPG, HCOMPRESS) tends
to be controlled by parameters that have no scientific
or statistical meaning. In contrast, the PHOTZIP
FITS compression algorithm gives the user control
of the preserved\lost signal in scientific terms. Like
Veran & Wright (1994), PHOTZIP should be used as
a preprocessor to multi-purpose lossless compression
algorithms (e.g. LZW). It allows those algorithms a
higher compression factor by losing some of the signal,
but ensures that only the background of the image
loses signal. The algorithm provides an interface for
defining, in terms of $\sigma$, the criterion for a pixel to be
considered background, and guarantees a user-defined
maximum absolute difference (also in terms of $\sigma$) for
any background pixel that loses signal. In Section 2 we
describe the algorithm, in Section 3 we present ways
to improve the compression factor, in Section 4 we
discuss photometric integrity and in Section 5 we discuss
the performance of the algorithm.

2 Lossy Compression That Preserves Bright Signals

Since a main purpose of our algorithm is to allow lossi- 
ness only for background pixels, the first stage of the 
algorithm is to determine for every pixel in the frame 
whether or not it is a background pixel. For PHOTZIP, 
we achieve this by using square window median filter- 
ing. The background value of each pixel is determined 
to be the median of the values of all pixels within its 
window. Assuming that the bias is zero, the gain is 1
electron and the read noise is negligible, we can compute
$\sigma$ for each pixel as $\sigma_{x,y} = \sqrt{B_{x,y}}$, where $B_{x,y}$ is
the estimated background of the pixel at coordinates
$(x, y)$.

The background computation stage can be summarized
by the following algorithm:

1. for $y \leftarrow 1$ to height do
2.   for $x \leftarrow 1$ to width do
3.     $B_{x,y} \leftarrow$ median of $C_{x-s,y-s}, C_{x-s+1,y-s}, \ldots, C_{x-s,y-s+1}, \ldots, C_{x+s,y+s}$
4.     $\sigma_{x,y} \leftarrow \sqrt{B_{x,y}}$
5.   end for
6. end for

$C_{x,y}$ is the value of the pixel at the coordinates $(x, y)$,
and $B_{x,y}$ is the estimated background value of the pixel
at the coordinates $(x, y)$. $s$ is the half-width of the
window. The nested loops in lines 1 and 2 ensure
that the background computation is done for every
pixel. Every pixel in the frame is therefore assigned
a background value that is the median of the pixels
in the $(2s+1) \times (2s+1)$ window centered on $(x, y)$.
After the estimated background is computed for a pixel,
the $\sigma$ of that pixel is computed as $\sigma_{x,y} = \sqrt{B_{x,y}}$.
In our current implementation, PHOTZIP assumes a
default half-width of 10, although the user can
specify any half-width.
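
The background stage above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PHOTZIP source: the function name `estimate_background` is hypothetical, the image is a plain list of lists, and the window is clipped at the frame edges because the text does not say how borders are handled.

```python
import math
from statistics import median

def estimate_background(image, s=10):
    """Return (background, sigma) grids for a 2-D list of pixel values.

    Assumption: the (2s+1) x (2s+1) window is clipped at the frame edges.
    """
    height, width = len(image), len(image[0])
    background = [[0.0] * width for _ in range(height)]
    sigma = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - s), min(height, y + s + 1))
                for i in range(max(0, x - s), min(width, x + s + 1))
            ]
            background[y][x] = median(window)
            # Zero bias, unit gain, negligible read noise: sigma = sqrt(B).
            sigma[y][x] = math.sqrt(background[y][x])
    return background, sigma

# A flat 100-count sky with one bright pixel: the median ignores the source.
frame = [[100] * 7 for _ in range(7)]
frame[3][3] = 5000
bg, sig = estimate_background(frame, s=2)
print(bg[3][3], sig[3][3])
```

In this toy frame the bright pixel does not pull up its own background estimate, which is the property the median filter is chosen for.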

When an image has a great many pixels, a pixel-by-pixel
median computation can become computationally
expensive. For an $N \times N$ frame we
need to compute the median value of $N^2$ square
windows, where each window has $(2s+1) \times (2s+1)$
pixels. For instance, for a 1024 x 1024 frame with
a window half-width of 10, the algorithm
needs to compute 1,048,576 (1024 x 1024) medians
of 441 (21 x 21) numbers each. In order to reduce
the needed processing power, we chose to compute
the background value not for every pixel, but for blocks
of 5 x 5 pixels in which the median value is computed
only for the "leader" pixel, which is the pixel at the
center of the block. All other pixels in that 5 x 5
block are associated with the same background value
as their "leader". Since backgrounds do not tend to
change drastically over small windows, we consider this
technique an acceptable approximation. In order to
compute the median efficiently, we use the common
algorithm for finding a median in linear time described
by Corman et al. (1990).
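
A minimal sketch of the "leader" shortcut follows, assuming the same clipped-window convention as before. The name `leader_background` and the use of `statistics.median` (rather than a linear-time selection algorithm) are illustrative choices, not details taken from PHOTZIP.

```python
from statistics import median

def leader_background(image, s=10, block=5):
    """Compute one median per block x block tile and share it across the tile."""
    height, width = len(image), len(image[0])
    background = [[0.0] * width for _ in range(height)]
    for y0 in range(0, height, block):
        for x0 in range(0, width, block):
            # Leader: the centre pixel of the tile, clamped to the frame.
            ly = min(y0 + block // 2, height - 1)
            lx = min(x0 + block // 2, width - 1)
            window = [
                image[j][i]
                for j in range(max(0, ly - s), min(height, ly + s + 1))
                for i in range(max(0, lx - s), min(width, lx + s + 1))
            ]
            value = median(window)
            # Every pixel of the tile inherits the leader's background.
            for y in range(y0, min(y0 + block, height)):
                for x in range(x0, min(x0 + block, width)):
                    background[y][x] = value
    return background

# 10 x 10 frame: left half at 100 counts, right half at 101 counts.
frame = [[100 + (x // 5) for x in range(10)] for _ in range(10)]
bg = leader_background(frame, s=2, block=5)
print(bg[0][0], bg[0][9])
```

For this 10 x 10 frame only 4 medians are computed instead of 100, which is the 25-fold saving the 5 x 5 tiling is meant to provide.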

Once the background value (and hence $\sigma$) is determined
for every pixel in the frame, the pixel values are
quantized. The quantization stage is controlled by two
parameters. The first, $d$, is the minimum brightness above
the background, in terms of $\sigma$, such that every pixel that is
less bright is classified as background. The second
parameter, $b$, is the maximum absolute difference, also
in terms of $\sigma$, that is allowed between a background
pixel in the original image and the same pixel in the
compressed\decompressed image. The basic idea of
the quantization is that the value of a pixel $(x, y)$
is quantized only if it is lower than $B_{x,y} + d \cdot \sigma_{x,y}$, so the
lossiness of the algorithm does not affect sufficiently
bright pixels. For every pixel $(x, y)$ in the frame, we
first check whether it is at least as bright as $B_{x,y} + d \cdot \sigma_{x,y}$. If the
pixel meets this criterion then it is not quantized and
does not lose any of its signal.

The criterion for a pixel to be quantized is

$$C_{x,y} < B_{x,y} + d \cdot \sigma_{x,y}. \qquad (1)$$

The top-level algorithm for selecting the pixels that
should be quantized is:

1. for $y \leftarrow 1$ to height do
2.   for $x \leftarrow 1$ to width do
3.     if $C_{x,y} - B_{x,y} < d \cdot \sigma_{x,y}$ then
4.       $C_{x,y} \leftarrow$ quantize($C_{x,y}$, $\sigma_{x,y}$, $b$)
5.   end for
6. end for

In line 4, the subroutine quantize is called to
perform the quantization of any pixel that
meets the criterion of line 3. The size of the quanta
used is $2 \cdot 2^{\lfloor \log_2 b\sigma \rfloor}$, where $b$ is a user-defined
positive value ($b > 0$) such that the absolute
difference between the value of the pixel in the original
frame and the value of the same pixel in the
compressed\decompressed frame cannot be greater than $b\sigma$.

The quantization algorithm is simply:

quantize($c$, $\sigma$, $b$)

1. quantum_size $\leftarrow 2 \cdot 2^{\lfloor \log_2 b\sigma \rfloor}$
2. quantized_value $\leftarrow$ quantum_size $\cdot$ Round($c$ / quantum_size)
3. return(quantized_value)

$c$ is the pixel's value and $b$ is the maximum absolute
difference (in terms of $\sigma$) that is allowed between a
pixel in the original frame and the same pixel in the
compressed\decompressed frame. Round is a function
that rounds its argument to the nearest integer. This
quantization symmetrically increases or decreases pixel
values. The interval of each quantum is
$[\mathrm{quantized\_value} - 2^{\lfloor \log_2 b\sigma \rfloor}, \mathrm{quantized\_value} + 2^{\lfloor \log_2 b\sigma \rfloor}]$.
Since $c$ is a value within this interval, the absolute difference
$|c - \mathrm{quantized\_value}|$ can never be greater than $2^{\lfloor \log_2 b\sigma \rfloor}$.






Since $2^{\lfloor \log_2 b\sigma \rfloor} \le b\sigma$, the absolute difference between a
pixel value in the compressed\decompressed frame and
the same pixel in the original frame cannot be greater
than $b\sigma$. It might seem that simple quanta of
size $2b\sigma$ could improve the compression factor by
providing larger quantum sizes while still complying with the
maximum absolute difference criterion. However, since
$\sigma$ is different for every pixel, this would lead to a large
variance in the quantum sizes and therefore severely
reduce the compression factor. For instance, suppose
that we have two pixels with values of 97 and 99, and
$\sigma$ of 9 and 10 respectively. Assuming $b = 1$, with a quantum
size of $2b\sigma$ the first value, 97, is quantized
using a quantum size of 18 and the second value, 99,
using a quantum size of 20. After the
quantization, the values are therefore 90
and 100 respectively. However, with a quantum size
of $2^{\lfloor \log_2 b\sigma \rfloor + 1}$, both values are quantized
using the same quantum size (16 in this case). After
the quantization, the value of both pixels is
96, which increases the compression potential of
pattern-matching-based compression algorithms. The
low variance of quantum sizes leads to a smaller variance
of quantized values, which is an important factor in the
performance of many multi-purpose compression
algorithms. Examples of the differences in the compression
factor (using BZIP2 with PHOTZIP) when using a
quantum size of $2b\sigma$ and a quantum size of $2 \cdot 2^{\lfloor \log_2 b\sigma \rfloor}$
are listed in the following table:
8 73.6%
8 76.5%
66.1%
2
10 77.3%
67.0%



The file that was used for the samples is
"ci040325ut005115p.fits", which is an unsigned integer FITS image
with a size of 2102 KB. This file is discussed more
thoroughly in Section 5.
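
The quantize routine and the 97/99 worked example from this section can be reproduced with a short Python sketch. The function names are hypothetical, and `quantize_linear` exists only to show the rejected $2b\sigma$ alternative; neither is taken from the PHOTZIP source.

```python
import math

def quantize(c, sigma, b):
    """Quantize c with quantum size 2 * 2**floor(log2(b * sigma))."""
    half = 2 ** math.floor(math.log2(b * sigma))  # never exceeds b * sigma
    quantum = 2 * half
    # Python's built-in round() sends exact halves to the nearest even integer.
    return quantum * round(c / quantum)

def quantize_linear(c, sigma, b):
    """The rejected alternative: quantum size 2 * b * sigma."""
    quantum = 2 * b * sigma
    return quantum * round(c / quantum)

# Values 97 and 99 with sigma 9 and 10, b = 1, as in the text.
print(quantize(97, 9, 1), quantize(99, 10, 1))                # 96 96
print(quantize_linear(97, 9, 1), quantize_linear(99, 10, 1))  # 90 100
```

With the power-of-two quantum both pixels land on the same value, while the $2b\sigma$ quantum sends them to different values, which is the effect the paragraph above attributes to the variance of quantum sizes.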

The function quantize might fail when $\sigma$ is equal
to zero. However, $\sigma = 0$ implies $B_{x,y} = 0$, and since
unsigned pixel values cannot be negative, the condition stated
in line 3 of the top-level algorithm cannot be satisfied
and the function quantize is not invoked.

Since the quantization is symmetric, the mean of 
the pixel values should be preserved in the compressed 
\ decompressed frame. Assuming a normal distribu- 
tion for the pixel values in the original frame, the me- 
dian of the original frame should be equal to the mean. 
As the mean is preserved, the median of the original 
frame can be taken from the mean of the compressed 
\ decompressed frame. 

Since the number of integers within each
$[\mathrm{quantized\_value} - 2^{\lfloor \log_2 b\sigma \rfloor}, \mathrm{quantized\_value} + 2^{\lfloor \log_2 b\sigma \rfloor}]$
quantum is odd, there is always one integer value that
can be quantized either up or down. A systematic
policy that always quantizes these values in the same
fashion (either up or down) will cause a systematic bias
to the mean. Thus, in order to avoid statistical biasing,
the standard Round function always rounds half
integers to the nearest even integer (Wolfram 1999).
Due to that behavior of Round, if the value to be quantized
is exactly in the center of the interval, it can either
gain or lose value so that systematic bias is avoided.
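
The half-to-even behavior can be checked directly, since Python's built-in `round` follows the same convention. The quantum size of 16 and the sample values below are chosen for illustration only.

```python
quantum = 16
# Pixel values that sit exactly halfway between two quantization levels.
midpoints = [8, 24, 40, 56]
quantized = [quantum * round(c / quantum) for c in midpoints]
errors = [q - c for q, c in zip(quantized, midpoints)]

print(quantized)    # [0, 32, 32, 64]
print(errors)       # [-8, 8, -8, 8]
print(sum(errors))  # 0
```

Successive midpoints alternate between rounding down and rounding up, so their errors cancel rather than accumulating in one direction.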



3 Compressing Non-Astronomical Edges and Artifacts

In practice, image pixels might exist that are clearly 
non-astronomical and do not need to be preserved. 
One example is the edge of a frame that is not exposed 
to the sky but dominated by other sources of noise. 
In some cases, pixels such as these would not have 
a high pixel value, but since they are in a relatively 
dim area of the frame, their value is high compared 
to their background. Pixel values of non-astronomical 
edges can sometimes have a high variance, so some of 
the pixels can be significantly brighter than other pixels
around them. Since $\sigma$ is determined by the local
background of each pixel, a low background value
leads to a low $\sigma$, and a low $\sigma$ leads to a low
$d\sigma$. Given their low $d\sigma$, the algorithm would normally
preserve the signal of pixels that do not have a high
value, because they meet the $C_{x,y} \ge B_{x,y} + d \cdot \sigma_{x,y}$ criterion
owing to the extremely low value of their background. This
might result in an unnecessarily low compression factor.
In order to allow a user to avoid this trap, we
provide another optional parameter, $t$, that sets a threshold
value for pixels that are allowed to lose signal. The
threshold value is set by $t$ in terms of the median value
of the frame. For instance, if $t = 1.5$ then any pixel
with a value less than 1.5 times the median pixel value
of the frame will not be required to preserve its value even if
it meets the $C_{x,y} \ge B_{x,y} + d \cdot \sigma_{x,y}$ criterion. Therefore, to
include this, line 3 of the top-level algorithm presented
in Section 2 should be changed to:

3. if $C_{x,y} - B_{x,y} < d \cdot \sigma_{x,y}$ or $C_{x,y} < t \cdot$ (median of $C_{1,1}, C_{2,1}, \ldots, C_{1,2}, \ldots, C_{width,height}$) then

We found this parameter to be very effective in compressing
Night Sky Live (Nemiroff et al. 2005) FITS frames,
in which a significant portion of the frame is not
directly exposed to the sky, and therefore consists of
scattered low values that have no scientific utility. If
those areas were uniform, then no special action would
need to be taken and the pixel values would be
naturally quantized.
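
The modified line 3 amounts to a two-part test per pixel. The sketch below is a hedged illustration: `should_quantize` and its argument names are hypothetical, and the numbers in the example are invented to show a dim edge pixel and a bright source pixel.

```python
import math

def should_quantize(c, background, sigma, d, t=None, frame_median=None):
    """Return True if a pixel may lose signal under the d and t rules."""
    if c - background < d * sigma:
        return True   # not bright enough above its local background
    if t is not None and c < t * frame_median:
        return True   # below the frame-wide threshold set by t
    return False

frame_median = 1000
# Dim edge pixel: 60 counts over a local background of 25 (sigma = 5).
edge = dict(c=60, background=25, sigma=5.0, d=3)
# Bright source pixel: 3000 counts over a background of 1000.
star = dict(c=3000, background=1000, sigma=math.sqrt(1000), d=3)

print(should_quantize(**edge))                                    # False
print(should_quantize(**edge, t=0.5, frame_median=frame_median))  # True
print(should_quantize(**star, t=0.5, frame_median=frame_median))  # False
```

Without $t$ the edge pixel is preserved because it stands well above its very low local background; with $t = 0.5$ it becomes eligible for quantization, while the source pixel is still preserved.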

4 Photometric Integrity 

Many astronomical images are sparsely populated. When 
the number of pixels taken up by point sources is small 
compared to the number of pixels that compose the 
background, it becomes possible to significantly com- 
press the image while preserving a useful level of pho- 
tometric integrity. In this Section we estimate this 
level of photometric integrity. 

When the background pixels are quantized symmetrically,
the mean of the pixel values is preserved to a level
that depends on the number of background pixels
averaged. A high number of averaged pixels leads to
a low standard error.

When a pixel value is quantized, the absolute
difference between the pixel value in the original frame
and the pixel value in the compressed\decompressed
frame is bounded by $2^{\lfloor \log_2 b\sigma \rfloor}$. Let $\Delta$ be the
difference between the value of a certain pixel $(x, y)$ in the
original frame and the value of the same pixel in the
compressed\decompressed frame:

$$\Delta = C_{x,y} - \mathrm{quantize}(C_{x,y}, \sigma, b)$$

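Let $\bar{C}_q$ be the mean of $n$ pixel values in the
compressed\decompressed frame:

$$\bar{C}_q = \frac{\sum_{i=1}^{n} (C_i - \Delta_i)}{n} = \frac{\sum_{i=1}^{n} C_i}{n} - \frac{\sum_{i=1}^{n} \Delta_i}{n}$$

So $\left| \frac{\sum_{i=1}^{n} \Delta_i}{n} \right|$ is the absolute difference between the
mean of the original pixel values and the mean of the
quantized pixel values.

Let $q = 2^{\lfloor \log_2 b\sigma \rfloor}$. Since the function quantize
described in Section 2 guarantees that the absolute
difference $|\Delta|$ is bounded by $q$, $\Delta$ can be any value within
the interval $[-q, q]$. Assuming that $\Delta$ is uniformly distributed
in the interval $[-q, q]$, the expected value of $\Delta$ is zero,
and the standard deviation of $\Delta$ is

$$\mathrm{stddev}(\Delta) = \frac{q}{\sqrt{3}}$$

(stddev(X) is defined as the standard deviation of a random variable X).
The variance of $\Delta$ is:

$$\mathrm{var}(\Delta) = \mathrm{stddev}^2(\Delta) = \frac{q^2}{3}$$

Let $\bar{\Delta}$ be the difference between the mean of
the original values and the mean of the quantized values
of some $n$ pixels. Treating the $\Delta_i$ as independent:

$$\bar{\Delta} = \frac{\sum_{i=1}^{n} \Delta_i}{n}$$

$$\mathrm{var}(\bar{\Delta}) = \frac{\mathrm{var}\left(\sum_{i=1}^{n} \Delta_i\right)}{n^2} = \frac{\sum_{i=1}^{n} \mathrm{var}(\Delta_i)}{n^2} = \frac{n \cdot \mathrm{var}(\Delta)}{n^2} = \frac{\mathrm{var}(\Delta)}{n}$$

$$\mathrm{stddev}(\bar{\Delta}) = \sqrt{\mathrm{var}(\bar{\Delta})} = \sqrt{\frac{\mathrm{var}(\Delta)}{n}} = \frac{2^{\lfloor \log_2 b\sigma \rfloor}}{\sqrt{3n}}$$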

We can see that the most likely value of the difference
between the mean of the original values and the
mean of the quantized values is zero. We can also see
that the standard deviation of that difference decreases
as $1/\sqrt{n}$ and approaches zero as
$n \to \infty$. For instance, if we choose $b = 1$, the
standard deviation of the mean of 1600 pixels is equal
to $2^{\lfloor \log_2 \sigma \rfloor} / \sqrt{3 \cdot 1600}$, which is smaller than $\sigma / 69$.
In typical integer FITS frames, the standard deviation in this case
will usually be smaller than 1, which indicates that the
mean of a group of many pixels in the original frame
will practically be equal to the mean of those pixels
in the compressed\decompressed frame. Since analysis
of background pixels usually involves very many
pixels, the mean will be only negligibly affected by the
presented quantization.

This analysis is based on two assumptions mentioned
above: 1. The mean of the difference between
the original values and the quantized values is zero,
and 2. The distribution of the differences is approximately
uniform. Watson (2002) suggests that this is
not necessarily true for quantum sizes greater than $2\sigma$.
Therefore, the algorithm can guarantee the preservation
of the mean only for values of $b$ such that $b \le 1$.
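
The $1/\sqrt{3n}$ scaling can be checked numerically with a small Monte Carlo sketch. It draws the per-pixel differences directly from the uniform distribution assumed in the analysis (it does not quantize real image data), and the values of $\sigma$, $b$, the trial count and the seed are arbitrary illustrative choices.

```python
import math
import random

def stddev_of_mean(b, sigma, n):
    """Predicted standard deviation of the mean difference over n pixels."""
    q = 2 ** math.floor(math.log2(b * sigma))
    return q / math.sqrt(3 * n)

rng = random.Random(0)          # fixed seed so the run is repeatable
sigma, b, n = 10.0, 1.0, 1600
q = 2 ** math.floor(math.log2(b * sigma))

# Each trial: mean of n differences drawn uniformly from [-q, q].
trials = [sum(rng.uniform(-q, q) for _ in range(n)) / n for _ in range(400)]
centre = sum(trials) / len(trials)
empirical = math.sqrt(sum((t - centre) ** 2 for t in trials) / (len(trials) - 1))

predicted = stddev_of_mean(b, sigma, n)
print(predicted, empirical)
```

For $\sigma = 10$ the predicted value is $8 / \sqrt{4800} \approx 0.115$, below the $\sigma / 69 \approx 0.145$ bound quoted in the text, and the simulated scatter should agree with it to within sampling error.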

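The same approach also applies to the sum.
Let $S_o$ be the sum of $n$ pixel values in the original frame
and $S_q$ be the sum of the $n$ quantized pixel values in
the compressed\decompressed frame:

$$S_o = \sum_{i=1}^{n} C_i$$

$$S_q = \sum_{i=1}^{n} (C_i - \Delta_i) = \sum_{i=1}^{n} C_i - \sum_{i=1}^{n} \Delta_i$$

$$\frac{S_o - S_q}{S_o} = \frac{\sum_{i=1}^{n} \Delta_i}{\sum_{i=1}^{n} C_i}$$

Since the most likely value of $\Delta_i$ is 0, the most likely
value of $\frac{S_o - S_q}{S_o}$ is also 0. The standard deviation of
the expression is

$$\mathrm{stddev}\left( \frac{\sum_{i=1}^{n} \Delta_i}{\sum_{i=1}^{n} C_i} \right) = \frac{n \cdot \mathrm{stddev}(\bar{\Delta})}{n \cdot \bar{C}} = \frac{2^{\lfloor \log_2 b\sigma \rfloor}}{\sqrt{3n} \cdot \bar{C}}$$

where $\bar{\Delta} = \frac{1}{n} \sum_{i=1}^{n} \Delta_i$, with $\mathrm{stddev}(\bar{\Delta}) = 2^{\lfloor \log_2 b\sigma \rfloor} / \sqrt{3n}$,
and $\bar{C} = \frac{1}{n} \sum_{i=1}^{n} C_i$ is the mean of the original pixel values.

For instance, computing the sum of 1600 quantized
pixels when $\bar{C} = 1000$ and $b = 1$ gives

$$\mathrm{stddev}\left( \frac{S_o - S_q}{S_o} \right) = \frac{2^{\lfloor \log_2 \sigma \rfloor}}{\sqrt{3 \cdot 1600} \cdot 1000} < \frac{\sigma}{69000}$$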

Why not quantize all of the pixels, including the
bright pixels of obvious point sources? The practical
risk here is that point sources so quantized might
involve a small number of pixels and lead to a large
error. For instance, if we have only 5 bright object
pixels which can be relied on for photometry of a certain
astronomical object, the standard deviation of the
mean of the pixel values is $2^{\lfloor \log_2 b\sigma \rfloor} / \sqrt{3 \cdot 5}$, which might be
$\sim 0.26\, b\sigma$. Since the pixels of the astronomical objects
are usually the most interesting to science, the presented
algorithm allows a user to completely preserve
their values through the quantization process. However,
since signal loss due to noise might be greater
than signal loss due to the quantization process, a user
might choose to set $d$ to $\infty$ in order to force the
algorithm to quantize all the pixels in the frame. This will
increase the compression factor (examples are given in
Section 5), but will also result in additional signal loss.
Even though the additional signal loss caused by the
quantization can be smaller than the signal loss already
caused by the background noise, since those pixels are
the most valuable for science we chose to allow absolute
preservation of their values. This also allows a
user to use extreme values of $b$, while still preserving
the point spread functions of the sources. Quantizing
the brightest pixels can change not only the raw sum of
these pixels, but also the point spread function. Note
that the point spread function shapes can be useful for
everything from photometry to discerning cosmic rays.

Also, in some cases, the photometric brightness of a
source can be estimated from the brightest source pixels
alone, after background subtraction. Such brightness
measures are particularly useful when the background
changes significantly and unpredictably over
the wings of the PSF. One project that uses such
photometric measures is the Night Sky Live project, which
struggles against an ill-behaved sloping background and
so records quantities like C1, C5, C9, etc., meaning
the level of the brightest pixel, the average of the five
brightest pixels, etc. Preserving the brightest pixel values
then specifically enables such photometry schemes.

Lastly, in cases of sub-pixel point spread functions,
when only the single brightest pixel is measured, not
preserving the brightest pixel could lead to a loss of
signal on the order of $b\sigma$.

Concentrating again on the background, using a
large number of pixels reduces the effect of the
quantization on the mean, so that the mean of the
compressed\decompressed frame should be practically
equal to the mean of the original frame. Assuming a
normal distribution of the pixel values, the median of
the original frame should be equal to the mean of the
original frame. Since the mean of the original frame is
nearly preserved, the median of the original frame can
be equated to the mean of the compressed\decompressed
frame. However, since the value of the median pixel
is changed, computing the median directly from the
compressed\decompressed frame produces a different
value than the median of the original frame. Therefore,
the median preservation is strongly subject to
the assumption of a normal distribution of the pixel values.
Practically, the number of pixel values involved in
the median computation is finite, and can actually be
smaller than the quantum size. The fewer pixels
involved, the higher the perturbation to the median.
In the worst-case scenario, the perturbation
can be $b\sigma$. In some cases, this perturbation may not
be easily avoided, but since $b$ is user defined, a user can
know and control the maximum perturbation allowed
for the median. Due to that perturbation, users of
the compressed\decompressed data might need to adjust
their photometry algorithms accordingly. In order
to avoid computing the background using the median
or mode, users might choose, for instance, to use the
mean with outlier rejection.

A higher $d$ quantizes more and brighter pixels. Therefore
a lower $d$ makes fainter objects peak above the
quantization and hence easier to detect in the
compressed\decompressed frame. However, a relatively
high $b$ might also contribute to the loss of faint objects.
For instance, when $b = 1$, faint objects with pixel
values such that $C_{x,y} - B_{x,y} < \sigma$ might be completely
gone from the compressed\decompressed frame. Therefore,
a user of the algorithm should consider her
expectations of photometric integrity and set the $b$ and $d$
parameters accordingly. We believe that an advantage
of our method is that it provides a guarantee on the
lost signal in standard statistical terms.

The local background estimation is particularly ef- 
ficient when the ratio of the number of point source 
pixels to background pixels is small. Additionally, the 
median filter window size (s) used for background com- 
putation should be significantly larger than the point 
spread function of the astronomical sources. Yet, the
median (or mean) is sometimes not an optimal background
estimation when the objects are large (Patat
2003). In astronomy, fortunately for our PHOTZIP
algorithm, variability is prevalent only among small
objects, always below the angular size of the point-spread
function.

A non-uniform unknown background is best de- 
termined locally. In pictures such as those taken by 
the Night Sky Live project, the background is highly 
non-uniform and usually unpredictable, making a lo- 
cal background estimation highly important. The local 
estimation also simplifies the usage of the algorithm, 
since one does not have to be aware of the type of the 
background of the frame when applying the compression.
Nevertheless, in cases where a uniform background is
desired, the $t$ parameter described in Section 3 can be
set for that purpose.

5 Performance of the PHOTZIP 
Algorithm 

PHOTZIP should be used along with a multi-purpose
lossless compression utility such that the lossless utility
is applied after running PHOTZIP. We tested PHOTZIP
with two common compression utilities: gzip, which
is based on Ziv & Lempel's compression algorithm
(Ziv & Lempel 1977), and bzip2, which is based on
Burrows & Wheeler's algorithm (Burrows & Wheeler
1994). These algorithms, like others, are based on finding
and compressing repetitive patterns in the data.
Additionally, quantization of the signal increases
redundancy and allows multi-purpose compression
algorithms a yet higher compression factor (Yang & Kieffer
1998).

The compression factor and the amount of lost signal
determine the utility of the algorithm. In order
to test the algorithm, we used all-sky 1024 x 1024
FITS images from the Night Sky Live all-sky monitoring
network, which deploys CONtinuous CAMeras
(CONCAMs). We also tested FITS images taken by
other optical instruments. In all cases, the algorithm
provided a significantly better compression factor than
the popular multi-purpose compression utilities alone.
Table 1 presents the effect of the $d$, $b$, $s$, $t$
parameters on the performance of the algorithm in terms
of compression factor, and compares it to compression
factors of common lossless compression programs. The
four rightmost columns are the compression factors
when using PHOTZIP before applying gzip and bzip2,
and when using gzip or bzip2 without using PHOTZIP.
The compression factor used here is the amount of data
removed by the compression, as a percentage of the size of the
original frame.
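
The effect of quantization on a general-purpose compressor can be illustrated with Python's standard `bz2` module. The sketch below uses synthetic noise around an assumed background of 400 counts, not a FITS file, and the helper names are hypothetical; it is not meant to reproduce the figures in Table 1.

```python
import bz2
import math
import random
import struct

def quantize(c, sigma, b):
    """Quantize c with quantum size 2 * 2**floor(log2(b * sigma))."""
    quantum = 2 * 2 ** math.floor(math.log2(b * sigma))
    return quantum * round(c / quantum)

def compression_factor(values):
    """Percentage of the raw size removed by bzip2 (16-bit big-endian pixels)."""
    raw = struct.pack(f">{len(values)}H", *values)
    return 100.0 * (1 - len(bz2.compress(raw)) / len(raw))

rng = random.Random(1)            # fixed seed so the run is repeatable
background = 400
sigma = math.sqrt(background)     # 20 counts
pixels = [max(0, round(rng.gauss(background, sigma))) for _ in range(20000)]
quantized = [quantize(c, sigma, 1.0) for c in pixels]

print(compression_factor(pixels), compression_factor(quantized))
```

The quantized stream contains only a handful of distinct values, so the lossless stage has far more repetition to exploit, while every pixel stays within $b\sigma$ of its original value.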

Figure 1 depicts the image file ci040325ut005115p.fits,
a typical FITS image from the NSL (Nemiroff et al.
2005) project. The picture is available in FITS and
JPG formats at http://www.NightSkyLive.net. The
files n3166_lj.fits and n3184_lj.fits were taken from the
galaxy catalog at http://www.astro.princeton.edu/~frei/catalog.htm.
Images from this catalog are shown as Figures 2 and 3.

As the maximum allowed absolute difference ($b$)
increases, the compression factor is higher, but so is the
signal loss. Since the tested frames contained mostly
background, like many astronomical images, the value
of $b$ has a substantial effect on the compression factor.
To test the utility of the $b$ parameter, we set $d$ to $\infty$,
effectively letting the algorithm quantize all the pixels
and therefore achieving a higher compression factor.
Although a better compression factor can be achieved,
not using the $d$ parameter can increase the loss of bright
signal, as discussed in Section 4.

A higher $t$ parameter increases the compression
factor, but this parameter should be used with extra
caution since it can also potentially cause signal loss
in bright pixels. The $s$ parameter normally does not
have a significant effect on the compression factor, but
has some effect on the time required to compress an
image.

6 Conclusions 

When even a single pixel is quantized, photometry can 
be at least partially compromised. A background es- 
timated from the mean of surrounding pixels, how- 
ever, will usually end up closer to the original back- 
ground. We have shown that using even as little as 
1600 pixels for background estimation can reduce the 
background error in the mean of the quantized pix- 
els to below a single count. Practically, this means 
that the background estimation and hence photomet- 
ric measurements can be protected in the PHOTZIP 
compression \ decompression process. 

In sum, we present here a simple lossy compression 
algorithm for astronomical images. The main advan- 
tage of this algorithm is that it can preserve the sig- 
nal of bright pixels, while symmetrically losing signal 
only from the background. The criterion for preserving
a pixel's value is user-defined (in terms of $\sigma$), so
the user can accurately control the compression
factor/signal loss trade-off. In addition, the algorithm
guarantees a user-defined maximum absolute difference
(also in terms of $\sigma$), so the user can control the
amount of lost signal for those areas in the frame that
are not preserved. The algorithm was implemented 
and tested on unsigned 16-bit integer FITS images, 
but we believe that the same approach can be applied 
also to signed integer and floating point images. The 
algorithm was tested on some typical astronomical 16- 
bit integer FITS images and appeared to be effective. 
PHOTZIP is now in routine use on FITS data taken 
by the Night Sky Live project. 
Table 1: Compression Factor

File Name               File Size  d   b   s    t      photzip   gzip    photzip   bzip2
                                                       + gzip            + bzip2
n3166_lr.fits           204 KB     1   1   20          76.6%     50.3%   81.9%     60.1%
n3166_lr.fits           204 KB     2   1   20          79.7%     50.3%   86.4%     60.1%
n3166_lr.fits           204 KB     1   2   20          80.3%     50.3%   84.4%     60.1%
n3166_lr.fits           204 KB     3   6   10          98.6%     50.3%   98.6%     60.1%
n3166_lr.fits           204 KB     3   3   20          91.7%     50.3%   92.6%     60.1%
n3166_lr.fits           204 KB     3   1   20          82.8%     50.3%   86.4%     60.1%
n3166_lr.fits           204 KB     ∞   1   20          84.0%     50.3%   87.9%     60.1%
n3166_lr.fits           204 KB     ∞   2   20          91.8%     50.3%   92.7%     60.1%
n3184_lj.fits           204 KB     1   1   20          75.0%     50.5%   81.3%     55.5%
n3184_lj.fits           204 KB     3   1   20          82.8%     50.5%   87.1%     55.5%
ci040325ut005115p.fits  2102 KB    1   1   8           66.7%     38.1%   73.6%     54.1%
ci040325ut005115p.fits  2102 KB    ∞   1   8           76.2%     38.1%   84.0%     54.1%
ci040325ut005115p.fits  2102 KB    1   2   10          74.0%     38.1%   77.3%     54.1%
ci040325ut005115p.fits  2102 KB    2   3   10          82.6%     38.1%   86.9%     54.1%
ci040325ut005115p.fits  2102 KB    2   3   10   0.5    88.7%     38.1%   91.1%     54.1%
ci040325ut005115p.fits  2102 KB    3   1   10   0.5    78.5%     38.1%   84.8%     54.1%
ci040325ut005115p.fits  2102 KB    3   1   10          73.4%     38.1%   81.8%     54.1%
ci040325ut005115p.fits  2102 KB    1   2   10   0.5    81.5%     38.1%   85.2%     54.1%
ci040325ut005115p.fits  2102 KB    1   3   10   0.5    87.4%     38.1%   89.2%     54.1%
ci040325ut005115p.fits  2102 KB    1   3   10          79.1%     38.1%   86.9%     54.1%
ci040325ut005115p.fits  2102 KB    1   6   10   1.15   92.1%     38.1%   93.2%     54.1%
ci040325ut005115p.fits  2102 KB    3   6   10   1.15   94.9%     38.1%   95.1%     54.1%



Figure 1: ci040325ut005115p.fits: A Night Sky Live all-sky picture
http://www.NightSkyLive.net

Figure 2: n3166_lr.fits: A picture taken from the galaxy catalog.
http://www.astro.princeton.edu/~frei/catalog.htm

Figure 3: n3184_lj.fits: A picture taken from the galaxy catalog.
http://www.astro.princeton.edu/~frei/catalog.htm



